MG-155: Proposal to add custom must-gather image option in MustGather Spec.#1906
Conversation
|
@shivprakashmuley: This pull request references MG-155 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@shivprakashmuley: This pull request references MG-155 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@shivprakashmuley: This pull request references MG-155 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
952fbd3 to
32a4048
Compare
32a4048 to
ca603c4
Compare
There was a problem hiding this comment.
is this a well defined imagestream with a pre-defined name we agree upon? how do we know which imagestream to look for? or are we assuming there will be only one imagestream, so we just pick that up and use it?
There was a problem hiding this comment.
We can keep the parameter ImageStreamTag in <ImageStream_name>:<Tag_name> format.
Multiple ImageStreams can be created in the operator namespace.
By parsing ImageStreamTag we can get hold of which ImageStream to look for and the respective tag.
There was a problem hiding this comment.
should we allow multiple custom images to be specified?
There was a problem hiding this comment.
We will allow one custom image per CR. It will be easier to maintain the success/failures of each image run.
There was a problem hiding this comment.
this will look for an imagestream tag in an imagestream in the namespace of the CR. how do we map it to fetch the imagestream tag from the must-gather-operator namespace?
There was a problem hiding this comment.
We can recommend to create all imagestreams in operator namespace only. Operator (in reconcile logic) will only look for imagestreams in operator namespace.
Please let me know if you see issues with this approach.
There was a problem hiding this comment.
so operator will resolve the imagestream tag and use that image to create the pod?
There was a problem hiding this comment.
yes. Operator will resolve imagestream tag, fetch corresponding image - if valid - create the must-gather pod with that image.
| // +kubebuilder:validation:Optional | ||
| // ImageStreamTag is the new field to specify a custom image from the allowlist. | ||
| ImageStreamTag string `json:"imageStreamTag,omitempty"` | ||
|
|
There was a problem hiding this comment.
we need to factor in some field for letting users choose some custom arguments for the custom must-gather images that they trigger.
There was a problem hiding this comment.
it's mainly so we can conclusively support these two use cases:
- IIRC, in the past enhancement discussion we wanted to introduce an
additionalConfigstruct that allows customizing the default platform must-gather into running/usr/bin/gather_audit - HyperShift custom must-gather dumping i.e. SREP-1884: Add support for ACM Hosted Control Plane (HCP) must-gathers must-gather-operator#296 (comment) requires the ability to specify arbitrary config value for
HostedClusterOptions{Namespace, Name}, etc.
There was a problem hiding this comment.
Explicitly for audit or /usr/bin/gather_sudit_logs we have a separate API field in the spec.
https://github.com/openshift/must-gather-operator/blob/master/controllers/mustgather/template.go#L29
https://github.com/openshift/must-gather-operator/blob/master/controllers/mustgather/template.go#L27
HyperShift custom must-gather dumping i.e. openshift/must-gather-operator#296 (comment) requires the ability to specify arbitrary config value for HostedClusterOptions{Namespace, Name}, etc.
yes we can add a new struct additionalConfig for any other custom arguments.
Should we do that as part of the same enhancement or a new one?
There was a problem hiding this comment.
Sounds reasonable IMHO to add this in the same enhancement, as I'd assume many custom must-gather images could benefit from custom args.
There was a problem hiding this comment.
now with the introduction of imageStreamTag, how is the default platform must-gather image (that we ship/referred to be known in our operator) handled?
There was a problem hiding this comment.
also, am assuming we treat the "default platform must-gather image" as an exception and don't change the behavior for it.
Still we need to support the images in https://github.com/openshift/oc/blob/main/pkg/cli/admin/mustgather/mustgather.go#L166 i.e. the custom must-gather images for each operator that is installed on the cluster and that ships a well trusted gather image. It can be very cumbersome for a cluster admin to have an imageStreamTag against each operator image especially when operators are updated the tags/images can change.
There was a problem hiding this comment.
also, am assuming we treat the "default platform must-gather image" as an exception and don't change the behavior for it.
imageStreamTag will be an optional parameter. If user does not specify it, operator will fall back to default must-gather image.
There was a problem hiding this comment.
Still we need to support the images in https://github.com/openshift/oc/blob/main/pkg/cli/admin/mustgather/mustgather.go#L166 i.e. the custom must-gather images for each operator that is installed on the cluster and that ships a well trusted gather image. It can be very cumbersome for a cluster admin to have an imageStreamTag against each operator image especially when operators are updated the tags/images can change.
There are some ways we can tackle this:
- For this we can introduce another API field, similar
--all-images. If that is set operator will look for the must-gather annotation and createimagestreams for those images instead of an admin. - As part of operator initialization/deployment operator scans and discovers these annotations and create separate imagestream or create a single imagestream and add these operator-must-gather images as a unique tag under one imagestream.
What we will save by this is, admin does not have toc reate imagestream(s) for these operator-must-gather images.
There was a problem hiding this comment.
Still we need to support the images in https://github.com/openshift/oc/blob/main/pkg/cli/admin/mustgather/mustgather.go#L166 i.e. the custom must-gather images for each operator that is installed on the cluster and that ships a well trusted gather image. It can be very cumbersome for a cluster admin to have an imageStreamTag against each operator image especially when operators are updated the tags/images can change.
There are some ways we can tackle this:
- For this we can introduce another API field, similar
--all-images. If that is set operator will look for the must-gather annotation and createimagestreams for those images instead of an admin.- As part of operator initialization/deployment operator scans and discovers these annotations and create separate imagestream or create a single imagestream and add these operator-must-gather images as a unique tag under one imagestream.
What we will save by this is, admin does not have toc reate imagestream(s) for these operator-must-gather images.
Can we first find out if there is a need to do this? i.e, do current users of the operator actually require this feature. I would like to start with just supporting collection of must-gather through specifying the custom images explicitly
5a9bb15 to
12cd8ba
Compare
There was a problem hiding this comment.
imageStreamRef with {Name, Tag, ..} would be more appropriate.
There was a problem hiding this comment.
or generally, a LocalObjectReference is a good choice for this type of references as this keep future extensibility options open.
There was a problem hiding this comment.
imageStreamRef
(eg. our CI release contoller uses imageStreamRef)
There was a problem hiding this comment.
Thanks for the suggestion @swghosh. I have included it in the proposal now.
There was a problem hiding this comment.
| # Enhancement Proposal: Custom Must-Gather Images | |
| # Custom must-gather Images for Support Log Gather operator |
| type AdditionalConfig struct { | ||
| // +kubebuilder:validation:Optional | ||
| // Metrics is an example field that could specify whether to collect Prometheus metrics. | ||
| Metrics bool `json:"metrics,omitempty"` |
There was a problem hiding this comment.
as the intent here was to support arbitrary parameter values, would you think we should opt for an unstructured field instead?
eg. runtime.RawExtension to allow any arbitrary key structs being placed in here.
There was a problem hiding this comment.
My understanding was, if we want to add/support any specific functionality from operator side (for which changes are needed in operator space) will go under AdditionalConfig, in this case then I felt structured field would make sense. For custom args anyway we will have args parameter, that can have any value depending the custom image requirements.
Do you feel we should support custom/arbitrary fields under additionalConfig as well? @swghosh
There was a problem hiding this comment.
IMHO sounds fair, we can take that approach. But in case we fully want to support the least customisation on the custom gather images, I guess we should atleast provide for
struct{ GatherCommand string, Args []string }
that way we won't need to jump to arbitrary unstructured immediately, yet provide room for enough customisation (args and command is something our oc adm must-gather cli already provides).
There was a problem hiding this comment.
| Command *GatherCommand `json:"command,omitempty"` | |
| Command *GatherCommand `json:"gatherCommand,omitempty"` |
|
/lgtm |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: swghosh The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@shivprakashmuley: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
This PR introduces an enhancement proposal to support the use of custom must-gather images with the must-gather-operator. This feature allows cluster administrators to define an allowlist of trusted images that can be used for diagnostic data collection, providing a secure and flexible way to gather specialized information.
Summary:
The proposal introduces two main API changes:
A new
MustGatherImage CRD: This cluster-scoped resource acts as an allowlist for custom must-gather images. Cluster administrators can manage this resource to control which images are permitted to run in the cluster.An update to the
MustGather CRD: The MustGather CRD is extended with an optional mustGatherImage field. When creating a MustGather resource, users can specify an image from the allowlist to be used for the data collection job.The must-gather-operator's logic is updated to validate the mustGatherImage against the MustGatherImage allowlist. If the image is valid, the operator will use it to run the must-gather job. If the image is not in the allowlist, or if the allowlist is not configured, the MustGather resource's status will be updated with an error. If no custom image is specified, the operator will use the default must-gather image, ensuring backward compatibility.
User-Facing Changes:
Cluster administrators can now create and manage a
MustGatherImageresource to control the use of custom must-gather images.Users can specify a custom must-gather image in the
MustGather CR, provided it is in the allowlist.JIRA tracker:
https://issues.redhat.com/browse/MG-155